Branch-and-Bound Reconstruction of Ancestral Sequences
Authors
Abstract
Ancestral sequence reconstruction is the statistical inference of the sequences that correspond to internal nodes of a phylogenetic tree [1]. Joint reconstruction seeks the most likely set of ancestral states for all ancestral taxa simultaneously, while marginal reconstruction infers the sequence at one specific internal node. In simple probabilistic models of evolution, both tasks can be performed efficiently using dynamic programming [3, 1]. The situation is more complicated in more detailed models of evolution, such as models with among-site rate variation (ASRV). These models assume that the rate of evolution can vary among sites, which is expressed by introducing a latent variable for the rate at each site. Maximum likelihood (ML) models incorporating ASRV are statistically superior to those assuming among-site rate homogeneity [2]. For example, it was shown that strong support for rodent nonmonophyly results from systematic error associated with the oversimplified assumption of homogeneity [4]. Currently, no efficient algorithm exists for joint ancestral reconstruction in ASRV models. In particular, dynamic programming approaches fail in these models. In this work we devise a branch-and-bound algorithm for joint ancestral reconstruction under ASRV and show that it can find the most likely reconstruction for large phylogenies.
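To make the idea concrete, here is a minimal single-site sketch in Python. It is not the authors' implementation. The toy tree, the branch lengths, the two rate categories, the Jukes-Cantor substitution model, and every function name are assumptions chosen for illustration. Under ASRV the joint likelihood of an ancestral assignment is a weighted sum over rate categories, so the maximization no longer decomposes over the tree as it does in plain dynamic programming. The sketch uses one simple upper bound, which may differ from the bound used in the paper: for a partial assignment, maximize separately within each rate category and sum the weighted maxima. That sum can never be smaller than the likelihood of any completion.

```python
import itertools
import math

STATES = "ACGT"
# Hypothetical toy tree: R -> (X, C), X -> (A, B). Values are branch lengths.
CHILDREN = {"R": ["X", "C"], "X": ["A", "B"]}
LENGTH = {"X": 0.1, "C": 0.3, "A": 0.2, "B": 0.2}
INTERNAL = ["R", "X"]  # ancestral nodes, in the order they are branched on
# Assumed discrete rate categories as (rate, prior weight) pairs.
RATES = [(0.2, 0.5), (1.8, 0.5)]


def jc_prob(a, b, t):
    """Jukes-Cantor probability of observing b after time t, starting from a."""
    e = math.exp(-4.0 * t / 3.0)
    return 0.25 + 0.75 * e if a == b else 0.25 - 0.25 * e


def allowed(node, fixed):
    """A fixed node (leaf or assigned ancestor) has one state; others are free."""
    return [fixed[node]] if node in fixed else list(STATES)


def best_below(node, state, rate, fixed):
    """Max probability of the subtree under `node`, for one rate category."""
    score = 1.0
    for child in CHILDREN.get(node, []):
        score *= max(
            jc_prob(state, s, rate * LENGTH[child]) * best_below(child, s, rate, fixed)
            for s in allowed(child, fixed)
        )
    return score


def bound(fixed):
    """Upper bound for a partial assignment: weighted sum of per-rate maxima.

    When every ancestral node is fixed, this equals the exact joint likelihood.
    """
    return sum(
        weight * max(0.25 * best_below("R", s, rate, fixed) for s in allowed("R", fixed))
        for rate, weight in RATES
    )


def branch_and_bound(leaves):
    best = {"score": 0.0, "assignment": None}

    def recurse(i, fixed):
        if i == len(INTERNAL):
            score = bound(fixed)  # exact likelihood for a complete assignment
            if score > best["score"]:
                best["score"] = score
                best["assignment"] = {n: fixed[n] for n in INTERNAL}
            return
        for s in STATES:
            trial = {**fixed, INTERNAL[i]: s}
            # Prune: no completion of `trial` can beat the incumbent.
            if bound(trial) > best["score"]:
                recurse(i + 1, trial)

    recurse(0, dict(leaves))
    return best["assignment"], best["score"]


def brute_force(leaves):
    """Reference answer: score every complete ancestral assignment."""
    best_score, best_assignment = 0.0, None
    for combo in itertools.product(STATES, repeat=len(INTERNAL)):
        assignment = dict(zip(INTERNAL, combo))
        score = bound({**leaves, **assignment})
        if score > best_score:
            best_score, best_assignment = score, assignment
    return best_assignment, best_score
```

Calling `branch_and_bound({"A": "A", "B": "A", "C": "G"})` returns the same assignment and score as `brute_force` on this toy input. On a tree this small the pruning saves little; the sketch only shows where the bound enters the search.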
Similar resources
On the inference of large phylogenies with long branches: How long is too long?
The accurate reconstruction of phylogenies from short molecular sequences is an important problem in computational biology. Recent work has highlighted deep connections between sequence-length requirements for high-probability phylogeny reconstruction and the related problem of the estimation of ancestral sequences. In [Daskalakis et al.’09], building on the work of [Mossel’04], a tight sequence...
On the inference of large phylogenies with long branches: How long is too long?
The accurate reconstruction of phylogenies from short molecular sequences is an important problem in computational biology. Recent work has highlighted deep connections between sequence-length requirements for high-probability phylogeny reconstruction and the related problem of the estimation of ancestral sequences. In Daskalakis et al. (in Probab. Theory Relat. Fields 2010), building on the wo...
A branch-and-bound algorithm for the inference of ancestral amino-acid sequences when the replacement rate varies among sites: Application to the evolution of five gene families
Motivation: We developed an algorithm to reconstruct ancestral sequences, taking into account the rate variation among sites of the protein sequences. Our algorithm maximizes the joint probability of the ancestral sequences, assuming that the rate is gamma distributed among sites. Our algorithm provably finds the global maximum. The use of 'joint' reconstruction is motivated by studies that use ...
A branch-and-bound algorithm for the inference of ancestral amino-acid sequences when the replacement rate varies among sites
Motivation: We developed an algorithm to reconstruct ancestral sequences, taking into
GeneTRACE - Reconstruction of Gene Content of Ancestral Species
While current computational methods allow the reconstruction of individual ancestral protein sequences, reconstruction of complete gene content of ancestral species is not yet an established task. In this paper, we describe GENETRACE, an efficient linear-time algorithm that allows the reconstruction of evolutionary history of individual protein families as well as the complete gene content of a...